
    Solving Multiclass Learning Problems via Error-Correcting Output Codes

    Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying collections of training examples of the form [x_i, f(x_i)]. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorithms C4.5 and CART, application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and application of binary concept learning algorithms with distributed output representations. This paper compares these three approaches to a new technique in which error-correcting codes are employed as a distributed output representation. We show that these output representations improve the generalization performance of both C4.5 and backpropagation on a wide range of multiclass learning tasks. We also demonstrate that this approach is robust with respect to changes in the size of the training sample, the assignment of distributed representations to particular classes, and the application of overfitting avoidance techniques such as decision-tree pruning. Finally, we show that---like the other methods---the error-correcting code technique can provide reliable class probability estimates. Taken together, these results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
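    The core of the error-correcting output code (ECOC) idea can be sketched in a few lines: each class is assigned a binary codeword, one binary learner is trained per codeword column, and a prediction is decoded to the class whose codeword is nearest in Hamming distance. The code matrix and the hand-set predicted bits below are illustrative assumptions, not the paper's actual codes or learners.

    ```python
    # Minimal ECOC decoding sketch (illustrative; code matrix is made up).
    import numpy as np

    def hamming_decode(code_matrix, bits):
        """Return the class whose codeword is nearest (Hamming) to the predicted bits."""
        distances = np.abs(code_matrix - bits).sum(axis=1)
        return int(np.argmin(distances))

    # Toy 4-class code matrix: each row is a class codeword, each column a binary task.
    code_matrix = np.array([
        [0, 0, 1, 1, 0, 1, 0],
        [0, 1, 0, 1, 1, 0, 0],
        [1, 0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0, 1],
    ])

    # Suppose the 7 binary learners output these bits, which differ from class 2's
    # codeword in a single position; decoding still recovers class 2.
    predicted_bits = np.array([1, 0, 0, 0, 1, 1, 0])
    print(hamming_decode(code_matrix, predicted_bits))  # -> 2
    ```

    The error-correcting property comes from the row separation of the code matrix: with a minimum Hamming distance of d between codewords, up to floor((d-1)/2) binary-learner errors can be corrected.
    
    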

    Integrating Learning from Examples into the Search for Diagnostic Policies

    This paper studies the problem of learning diagnostic policies from training examples. A diagnostic policy is a complete description of the decision-making actions of a diagnostician (i.e., tests followed by a diagnostic decision) for all possible combinations of test results. An optimal diagnostic policy is one that minimizes the expected total cost, which is the sum of measurement costs and misdiagnosis costs. In most diagnostic settings, there is a tradeoff between these two kinds of costs. This paper formalizes diagnostic decision making as a Markov Decision Process (MDP). The paper introduces a new family of systematic search algorithms based on the AO* algorithm to solve this MDP. To make AO* efficient, the paper describes an admissible heuristic that enables AO* to prune large parts of the search space. The paper also introduces several greedy algorithms, including some improvements over previously published methods. The paper then addresses the question of learning diagnostic policies from examples. When the probabilities of diseases and test results are computed from training data, there is a great danger of overfitting. To reduce overfitting, regularizers are integrated into the search algorithms. Finally, the paper compares the proposed methods on five benchmark diagnostic data sets. The studies show that in most cases the systematic search methods produce better diagnostic policies than the greedy methods. In addition, the studies show that for training sets of realistic size, the systematic search algorithms are practical on today's desktop computers.
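    The objective described above, expected total cost as the sum of measurement costs and probability-weighted misdiagnosis costs, can be illustrated with a toy calculation. The numbers below are invented for illustration and are not from the paper's benchmarks.

    ```python
    # Toy illustration of the expected-total-cost objective for a diagnostic
    # policy: test cost plus the expectation of the misdiagnosis cost over
    # the possible outcomes. All figures are made up.
    def expected_total_cost(test_cost, outcome_probs, misdiag_costs):
        """Sum the measurement cost and the probability-weighted misdiagnosis costs."""
        return test_cost + sum(p * c for p, c in zip(outcome_probs, misdiag_costs))

    # One test costing 10; correct diagnosis (cost 0) with probability 0.9,
    # misdiagnosis costing 100 with probability 0.1.
    print(expected_total_cost(10.0, [0.9, 0.1], [0.0, 100.0]))  # -> 20.0
    ```

    The tradeoff the abstract mentions is visible even here: a cheaper test with a higher misdiagnosis probability may or may not lower the total, which is what the AO*-based search optimizes over whole policies.
    
    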

    The use of provenance in information retrieval

    The volume of electronic information that users accumulate is steadily rising. A recent study [2] found that there were on average 32,000 pieces of information (e-mails, web pages, documents, etc.) for each user. The problem of organizing…

    Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification

    Mammogram classification is directly related to computer-aided diagnosis of breast cancer. Traditional methods rely on regions of interest (ROIs), which require great effort to annotate. Inspired by the success of using deep convolutional features for natural image analysis and multi-instance learning (MIL) for labeling a set of instances/patches, we propose end-to-end trained deep multi-instance networks for mass classification based on the whole mammogram, without the aforementioned ROIs. We explore three different schemes to construct deep multi-instance networks for whole mammogram classification. Experimental results on the INbreast dataset demonstrate the robustness of the proposed networks compared to previous work using segmentation and detection annotations. (MICCAI 2017 camera-ready.)

    Boosting parallel perceptrons for label noise reduction in classification problems

    The final publication is available at Springer via http://dx.doi.org/10.1007/11499305_60. Proceedings of the First International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2005, Las Palmas, Canary Islands, Spain, June 15-18, 2005. Boosting combines an ensemble of weak learners to construct a new weighted classifier that is often more accurate than any of its components. The construction of such learners, whose training sets depend on the performance of the previous members of the ensemble, is carried out by successively focusing on those patterns that are harder to classify. This deteriorates boosting's results when dealing with malicious noise such as, for instance, mislabeled training examples. In order to detect and avoid those noisy examples during the learning process, we propose the use of Parallel Perceptrons. Among other things, these novel machines allow one to naturally define margins for hidden unit activations. We shall use these margins to detect which patterns may have an incorrect label and also which are safe, in the sense of being well represented in the training sample by many other similar patterns. We shall reduce the weights of the former, as candidates for being noisy examples, and augment the weights of the latter, as support for the overall detection procedure. With partial support of Spain's CICyT, TIC 01-572, TIN 2004-0767.

    Adaptive Anomaly Detection via Self-Calibration and Dynamic Updating

    The deployment and use of Anomaly Detection (AD) sensors often requires the intervention of a human expert to manually calibrate and optimize their performance. Depending on the site and the type of traffic it receives, the operators might have to provide recent and sanitized training data sets, the characteristics of expected traffic (i.e., outlier ratio), and exceptions or even expected future modifications of the system's behavior. In this paper, we study the potential performance issues that stem from fully automating the AD sensors' day-to-day maintenance and calibration. Our goal is to remove the dependence on a human operator, using an unlabeled, and thus potentially dirty, sample of incoming traffic. To that end, we propose to enhance the training phase of AD sensors with a self-calibration phase, leading to the automatic determination of the optimal AD parameters. We show how this novel calibration phase can be employed in conjunction with previously proposed methods for training data sanitization, resulting in a fully automated AD maintenance cycle. Our approach is completely agnostic to the underlying AD sensor algorithm. Furthermore, the self-calibration can be applied in an online fashion to ensure that the resulting AD models reflect changes in the system's behavior which would otherwise render the sensor's internal state inconsistent. We verify the validity of our approach through a series of experiments where we compare the manually obtained optimal parameters with the ones computed from the self-calibration phase. Modeling traffic from two different sources, the fully automated calibration shows a 7.08% reduction in detection rate and a 0.06% increase in false positives, in the worst case, when compared to the optimal selection of parameters. Finally, our adaptive models outperform the statically generated ones, retaining the gains in performance from the sanitization process over time.

    A New Pairwise Ensemble Approach for Text Classification


    Fast Reinforcement Learning with Large Action Sets Using Error-Correcting Output Codes for MDP Factorization

    The use of Reinforcement Learning in real-world scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the supervised classification framework where the optimal policy is obtained through a multiclass classifier, the set of classes being the set of actions of the problem. We introduce error-correcting output codes (ECOCs) in this setting and propose two new methods for reducing complexity when using rollout-based approaches. The first method consists in using an ECOC-based classifier as the multiclass classifier, reducing the learning complexity from O(A^2) to O(A log(A)). We then propose a novel method that profits from the ECOC's coding dictionary to split the initial MDP into O(log(A)) separate two-action MDPs. This second method reduces learning complexity even further, from O(A^2) to O(log(A)), thus rendering problems with large action sets tractable. We finish by experimentally demonstrating the advantages of our approach on a set of benchmark problems, both in speed and performance.
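    The factorization idea above can be sketched concretely: with a binary coding of the action set, a problem with A actions is handled by ceil(log2(A)) two-action sub-policies, one per code bit, and a full action is recovered by reassembling the bits. The bit policies below are hypothetical stand-ins for learned two-action controllers, not the paper's method in detail.

    ```python
    # Hedged sketch of ECOC-style MDP factorization: one two-action policy
    # per code bit, composed back into an action index. Bit policies are toys.
    import math

    A = 8                                   # size of the original action set
    n_bits = math.ceil(math.log2(A))        # number of two-action sub-problems

    def compose_action(state, bit_policies):
        """Recover a full action index from the per-bit two-action policies."""
        bits = [policy(state) for policy in bit_policies]   # each returns 0 or 1
        return sum(bit << i for i, bit in enumerate(bits))

    # Three toy bit policies standing in for learned two-action controllers.
    bit_policies = [lambda s: s % 2, lambda s: (s // 2) % 2, lambda s: 0]
    print(compose_action(5, bit_policies))  # state 5 -> bits [1, 0, 0] -> action 1
    ```

    The complexity gain in the abstract comes from exactly this structure: instead of ranking A actions, only log2(A) binary decisions are made per state.
    
    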

    Error-Correcting Tournaments

    We present a family of pairwise tournaments reducing k-class classification to binary classification. These reductions are provably robust against a constant fraction of binary errors. The results improve on the PECOC construction [SECOC] with an exponential improvement in computation, from O(k) to O(log_2 k), and the removal of a square root in the regret dependence, matching the best possible computation and regret up to a constant.
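    The O(log_2 k) computation can be pictured with a plain single-elimination bracket: each round halves the candidate class set via binary comparisons, so a prediction costs a logarithmic number of binary decisions. This sketch shows only the bracket structure, not the paper's error-correcting construction; the comparator is a toy stand-in for a learned pairwise classifier.

    ```python
    # Illustrative single-elimination tournament over k classes; each round
    # halves the candidates, giving O(log k) rounds of binary decisions.
    def tournament_predict(candidates, compare):
        """Run rounds of pairwise matches until one class remains."""
        while len(candidates) > 1:
            winners = []
            for i in range(0, len(candidates) - 1, 2):
                a, b = candidates[i], candidates[i + 1]
                winners.append(a if compare(a, b) else b)
            if len(candidates) % 2 == 1:      # odd class out gets a bye
                winners.append(candidates[-1])
            candidates = winners
        return candidates[0]

    # Toy comparator: always prefer the larger class label.
    print(tournament_predict(list(range(8)), lambda a, b: a > b))  # -> 7
    ```

    The paper's contribution is making such tournaments robust: redundant matches let the final winner survive a constant fraction of wrong binary outcomes, which a plain bracket like this one cannot.
    
    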

    Database Architecture (R)evolution: New Hardware vs. New Software
